SEQUENCE LISTING



(1) GENERAL INFORMATION: 


(i) APPLICANT: 

(A) NAME: PEPTIDE THERAPEUTICS LIMITED 

(B) STREET: 100 Fulbourn Road 

(C) CITY: Cambridge 

(D) STATE: not applicable

(E) COUNTRY: United Kingdom 

(F) POSTAL CODE (ZIP): CB1 9PT

(ii) TITLE OF INVENTION: ATTENUATED BACTERIA USEFUL IN VACCINES 
(iii) NUMBER OF SEQUENCES: 6 

(iv) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO)

(v) CURRENT APPLICATION DATA:
APPLICATION NUMBER:

(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1690 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE: 
WO 99/49026 PCT/GB99/00935
(A) ORGANISM: aroC of E.coli 

(ix) FEATURE: 

(A) NAME/KEY: CDS 
(B) LOCATION: 492..1562

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

GTCGACGCGG TGGATATCTC TCCAGACGCG CTGGCGGTTG CTGAACAGAA CATCGAAGAA 60 

CACGGTCTGA TCCACAACGT CATTCCGATT CGTTCCGATC TGTTCCGCGA CTTGCCGAAA 120

GTGCAGTACG ACCTGATTGT CACTAACCCG CCGTATGTCG ATGCGAAGAT ATGTCCGACC 180

TGCCAAACAA TACCGCCACG AGCCGGAACT GGGCCTGGCA TCTGGCACTG ACGGCCTGAA 240

ACTGACGCGT CGCATTCTCG GTAACGCGGC AGATTACCTT GCTGATGATG GCGTGTTGAT 300 

TTGTGAAGTC GGCAACAGCA TGGTACATCT TATGGAACAA TATCCGGATG TTCCGTTCAC 360

CTGGCTGGAG TTTGATAACG GCGGCGATGG TGTGTTTATG CTCACCAAAG AGCAGCTTAT 420

TGCCGCACGA GAACATTTCG CGATTTATAA AGATTAAGTA AACACGCAAA CACAACAATA 480

ACGGAGCCGT G ATG GCT GGA AAC ACA ATT GGA CAA CTC TTT CGC GTA ACC 530
Met Ala Gly Asn Thr Ile Gly Gln Leu Phe Arg Val Thr
1 5 10

ACC TTC GGC GAA TCG CAC GGG CTG GCG CTC GGC TGC ATC GTC GAT GGT 578
Thr Phe Gly Glu Ser His Gly Leu Ala Leu Gly Cys Ile Val Asp Gly
15 20 25

GTT CCG CCA GGC ATT CCG CTG ACG GAA GCG GAC CTG CAA CAT GAC CTC 626
Val Pro Pro Gly Ile Pro Leu Thr Glu Ala Asp Leu Gln His Asp Leu
30 35 40 45

GAC CGT CGT CGC CCT GGG ACA TCG CGC TAT ACC ACC CAG CGC CGC GAG 674
Asp Arg Arg Arg Pro Gly Thr Ser Arg Tyr Thr Thr Gln Arg Arg Glu
50 55 60

CCG GAT CAG GTC AAA ATT CTC TCC GGT GTT TTT GAA GGC GTT ACT ACC 722 
Pro Asp Gln Val Lys Ile Leu Ser Gly Val Phe Glu Gly Val Thr Thr
65 70 75

GGC ACC AGC ATT GGC TTG TTG ATC GAA AAC ACT GAC CAG CGC TCT CAG 770
Gly Thr Ser Ile Gly Leu Leu Ile Glu Asn Thr Asp Gln Arg Ser Gln
80 85 90

GAT TAC AGT GCG ATT AAG GAC GTT TTC CGT CCA GGC CAT GCC GAT TAC 818
Asp Tyr Ser Ala Ile Lys Asp Val Phe Arg Pro Gly His Ala Asp Tyr
95 100 105

ACC TAC GAA CAA AAA TAC GGT CTG CGC GAT TAT CGC GGC GGT GGA CGT 866
Thr Tyr Glu Gln Lys Tyr Gly Leu Arg Asp Tyr Arg Gly Gly Gly Arg
110 115 120 125

TCT TCC GCC CGC GAA ACC GCC ATG CGC GTG GCG GCA GGA GCT ATT GCC 914
Ser Ser Ala Arg Glu Thr Ala Met Arg Val Ala Ala Gly Ala Ile Ala
130 135 140

AAA AAA TAT CTC GCC GAG AAA TTT GGT ATT GAA ATC CGT GGC TGC CTG 962
Lys Lys Tyr Leu Ala Glu Lys Phe Gly Ile Glu Ile Arg Gly Cys Leu
145 150 155

ACC CAG ATG GGC GAC ATT CCG CTG GAT ATC AAA GAC TGG TCG CAG GTC 1010
Thr Gln Met Gly Asp Ile Pro Leu Asp Ile Lys Asp Trp Ser Gln Val
160 165 170

GAG CAA AAT CCG TTT TTT TGC CCG GAC CCC GAC AAA ATC GAC GCG TTA 1058
Glu Gln Asn Pro Phe Phe Cys Pro Asp Pro Asp Lys Ile Asp Ala Leu
175 180 185

GAC GAG TTG ATG CGT GCG CTG AAA AAA GAG GGC GAC TCC ATC GGC GCT 1106
Asp Glu Leu Met Arg Ala Leu Lys Lys Glu Gly Asp Ser Ile Gly Ala
190 195 200 205

AAA GTC ACC GTT GTT GCC AGT GGC GTT CCT GCC GGA CTT GGC GAG CCG 1154
Lys Val Thr Val Val Ala Ser Gly Val Pro Ala Gly Leu Gly Glu Pro
210 215 220

GTC TTT GAC CGC CTG GAT GCT GAC ATC GCC CAT GCG CTG ATG AGC ATC 1202
Val Phe Asp Arg Leu Asp Ala Asp Ile Ala His Ala Leu Met Ser Ile
225 230 235



AAC GCG GTG AAA GGC GTG GAA ATT GGC GAC GGC TTT GAC GTG GTG GCG 1250
Asn Ala Val Lys Gly Val Glu Ile Gly Asp Gly Phe Asp Val Val Ala
240 245 250

CTG CGC GGC AGC CAG AAC CGC GAT GAA ATC ACC AAA GAC GGT TTC CAG 1298
Leu Arg Gly Ser Gln Asn Arg Asp Glu Ile Thr Lys Asp Gly Phe Gln
255 260 265

AGC AAC CAT GCG GGC GGC ATT CTC GGC GGT ATC AGC AGC GGG CAG CAA 1346
Ser Asn His Ala Gly Gly Ile Leu Gly Gly Ile Ser Ser Gly Gln Gln
270 275 280 285

ATC ATT GCC CAT ATG GCG CTG AAA CCG ACC TCC AGC ATT ACC GTG CCG 1394
Ile Ile Ala His Met Ala Leu Lys Pro Thr Ser Ser Ile Thr Val Pro
290 295 300

GGT CGT ACC ATT AAC CGC TTT GGC GAA GAA GTT GAG ATG ATC ACC AAA 1442
Gly Arg Thr Ile Asn Arg Phe Gly Glu Glu Val Glu Met Ile Thr Lys
305 310 315

GGC CGT CAC GAT CCC TGT GTC GGG ATC CGC GCA GTG CCG ATC GCA GAA 1490
Gly Arg His Asp Pro Cys Val Gly Ile Arg Ala Val Pro Ile Ala Glu
320 325 330

GCG AAT GCT GGC GAT CGT TTT AAT GGA TCA CCT GTT ACG GCA ACG GGC 1538
Ala Asn Ala Gly Asp Arg Phe Asn Gly Ser Pro Val Thr Ala Thr Gly
335 340 345

GCA AAA TGC CGA TGT GAA GAC TGA TATTCCACGC TGGTAAAAAA TGAATAAAAC 1592
Ala Lys Cys Arg Cys Glu Asp *
350 355

CGCGATTGCG CTGCTGGCTC TGCTTGCCAG TAGCGCCAGC CTGGCAGCGA CGCCGTGGCA 1652
AAAAATAACC CAACCTGTGC CGGGTAGCGC CAAATCGA 1690



(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 356 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Ala Gly Asn Thr Ile Gly Gln Leu Phe Arg Val Thr Thr Phe Gly
1 5 10 15

Glu Ser His Gly Leu Ala Leu Gly Cys Ile Val Asp Gly Val Pro Pro
20 25 30

Gly Ile Pro Leu Thr Glu Ala Asp Leu Gln His Asp Leu Asp Arg Arg
35 40 45

Arg Pro Gly Thr Ser Arg Tyr Thr Thr Gln Arg Arg Glu Pro Asp Gln
50 55 60

Val Lys Ile Leu Ser Gly Val Phe Glu Gly Val Thr Thr Gly Thr Ser
65 70 75 80

Ile Gly Leu Leu Ile Glu Asn Thr Asp Gln Arg Ser Gln Asp Tyr Ser
85 90 95

Ala Ile Lys Asp Val Phe Arg Pro Gly His Ala Asp Tyr Thr Tyr Glu
100 105 110

Gln Lys Tyr Gly Leu Arg Asp Tyr Arg Gly Gly Gly Arg Ser Ser Ala
115 120 125

Arg Glu Thr Ala Met Arg Val Ala Ala Gly Ala Ile Ala Lys Lys Tyr
130 135 140

Leu Ala Glu Lys Phe Gly Ile Glu Ile Arg Gly Cys Leu Thr Gln Met
145 150 155 160

Gly Asp Ile Pro Leu Asp Ile Lys Asp Trp Ser Gln Val Glu Gln Asn
165 170 175

Pro Phe Phe Cys Pro Asp Pro Asp Lys Ile Asp Ala Leu Asp Glu Leu
180 185 190

Met Arg Ala Leu Lys Lys Glu Gly Asp Ser Ile Gly Ala Lys Val Thr
195 200 205

Val Val Ala Ser Gly Val Pro Ala Gly Leu Gly Glu Pro Val Phe Asp
210 215 220

Arg Leu Asp Ala Asp Ile Ala His Ala Leu Met Ser Ile Asn Ala Val
225 230 235 240

Lys Gly Val Glu Ile Gly Asp Gly Phe Asp Val Val Ala Leu Arg Gly
245 250 255

Ser Gln Asn Arg Asp Glu Ile Thr Lys Asp Gly Phe Gln Ser Asn His
260 265 270

Ala Gly Gly Ile Leu Gly Gly Ile Ser Ser Gly Gln Gln Ile Ile Ala
275 280 285

His Met Ala Leu Lys Pro Thr Ser Ser Ile Thr Val Pro Gly Arg Thr
290 295 300

Ile Asn Arg Phe Gly Glu Glu Val Glu Met Ile Thr Lys Gly Arg His
305 310 315 320

Asp Pro Cys Val Gly Ile Arg Ala Val Pro Ile Ala Glu Ala Asn Ala
325 330 335

Gly Asp Arg Phe Asn Gly Ser Pro Val Thr Ala Thr Gly Ala Lys Cys
340 345 350

Arg Cys Glu Asp *
355

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1713 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: ompC of E.coli 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 491..1594

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

GTTAACAAGC GTTATAGTTT TTCTGTGGTA GCACAGAATA ATGAAAAGTG TGTAAAGAAG 60

GGTAAAAAAA ACCGAATGCG AGGCATCCGG TTGAAATAGG GGTAAACAGA CATTCAGAAA 120

TGAATGACGG TAATAAATAA AGTTAATGAT GATAGCGGGA GTTATTCTAG TTGCGAGTGA 180

AGGTTTTGTT TTGACATTCA GTGCTGTCAA ATACTTAAGA ATAAGTTATT GATTTTAACC 240

TTGAATTATT ATTGCTTGAT GTTAGGTGCT TATTTCGCCA TTCCGCAATA ATCTTAAAAA 300

GTTCCCTTGC ATTTACATTT TGAAACATCT ATAGCGATAA ATGAAACATC TTAAAAGTTT 360

TAGTATCATA TTCGTGTTGG ATTATTCTGC ATTTTTGGGG AGAATGGACT TGCCGACTGA 420

TTAATGAGGG TTAATCAGTA TGCAGTGGCA TAAAAAAGCA AATAAAGGCA TATAACAGAG 480



GGTTAATAAC ATG AAA GTT AAA GTA CTG TCC CTC CTG GTC CCA GCT CTG 529
Met Lys Val Lys Val Leu Ser Leu Leu Val Pro Ala Leu
360 365 370

CTG GTA GCA GGC GCA GCA AAC GCT GCT GAA GTT TAC AAC AAA GAC GGC 577
Leu Val Ala Gly Ala Ala Asn Ala Ala Glu Val Tyr Asn Lys Asp Gly
375 380 385

AAC AAA TTA GAT CTG TAC GGT AAA GTA GAC GGC CTG CAC TAT TTC TCT 625
Asn Lys Leu Asp Leu Tyr Gly Lys Val Asp Gly Leu His Tyr Phe Ser
390 395 400

GAC AAC AAA GAT GTA GAT GGC GAC CAG ACC TAC ATG CGT CTT GGC TTC 673
Asp Asn Lys Asp Val Asp Gly Asp Gln Thr Tyr Met Arg Leu Gly Phe
405 410 415

AAA GGT GAA ACT CAG GTT ACT GAC CAG CTG ACC GGT TAC GGC CAG TGG 721
Lys Gly Glu Thr Gln Val Thr Asp Gln Leu Thr Gly Tyr Gly Gln Trp
420 425 430

GAA TAT CAG ATC CAG GGC AAC AGC GCT GAA AAC GAA AAC AAC TCC TGG 769
Glu Tyr Gln Ile Gln Gly Asn Ser Ala Glu Asn Glu Asn Asn Ser Trp
435 440 445 450

ACC CGT GTG GCA TTC GCA GGT CTG AAA TTC CAG GAT GTG GGT TCT TTC 817
Thr Arg Val Ala Phe Ala Gly Leu Lys Phe Gln Asp Val Gly Ser Phe
455 460 465

GAC TAC GGT CGT AAC TAC GGC GTT GTT TAT GAC GTA ACT TCC TGG ACC 865
Asp Tyr Gly Arg Asn Tyr Gly Val Val Tyr Asp Val Thr Ser Trp Thr
470 475 480
GAC GTA CTG CCA GAA TTC GGT GGT GAC ACC TAC GGT TCT GAC AAC TTC 913
Asp Val Leu Pro Glu Phe Gly Gly Asp Thr Tyr Gly Ser Asp Asn Phe
485 490 495

ATG CAG CAG CGT GGT AAC GGC TTC GCG ACC TAC CGT AAC ACT GAC TTC 961
Met Gln Gln Arg Gly Asn Gly Phe Ala Thr Tyr Arg Asn Thr Asp Phe
500 505 510

TTC GGT CTG GTT GAC GGC CTG AAC TTT GCT GTT CAG TAC CAG GGT AAA 1009
Phe Gly Leu Val Asp Gly Leu Asn Phe Ala Val Gln Tyr Gln Gly Lys
515 520 525 530

AAC GGC AAC CCA TCT GGT GAA GGC TTT ACT AGT GGC GTA ACT AAC AAC 1057
Asn Gly Asn Pro Ser Gly Glu Gly Phe Thr Ser Gly Val Thr Asn Asn
535 540 545

GGT CGT GAC GCA CTG CGT CAA AAC GGC GAC GGC GTC GGC GGT TCT ATC 1105
Gly Arg Asp Ala Leu Arg Gln Asn Gly Asp Gly Val Gly Gly Ser Ile
550 555 560

ACT TAT GAT TAC GAA GGT TTC GGT ATC GGT GGT GCG ATC TCC AGC TCC 1153
Thr Tyr Asp Tyr Glu Gly Phe Gly Ile Gly Gly Ala Ile Ser Ser Ser
565 570 575

AAA CGT ACT GAT GCT CAG AAC ACC GCT GCT TAC ATC GGT AAC GGC GAC 1201
Lys Arg Thr Asp Ala Gln Asn Thr Ala Ala Tyr Ile Gly Asn Gly Asp
580 585 590

CGT GCT GAA ACC TAC ACT GGT GGT CTG AAA TAC GAC GCT AAC AAC ATC 1249
Arg Ala Glu Thr Tyr Thr Gly Gly Leu Lys Tyr Asp Ala Asn Asn Ile
595 600 605 610

TAC CTG GCT GCJ CAG TAC ACC CAG ACC TAC AAC GCA ACT CGC GTA GGT 1297
Tyr Leu Ala Ala Gln Tyr Thr Gln Thr Tyr Asn Ala Thr Arg Val Gly
615 620 625

TCC CTG GGT TGG GCG AAC AAA GCA CAG AAC TTC GAA GCT GTT GCT CAG 1345
Ser Leu Gly Trp Ala Asn Lys Ala Gln Asn Phe Glu Ala Val Ala Gln
630 635 640

TAC CAG TTC GAC TTC GGT CTG CGT CCG TCC CTG GCT TAC CTG CAG TCT 1393
Tyr Gln Phe Asp Phe Gly Leu Arg Pro Ser Leu Ala Tyr Leu Gln Ser
645 650 655

AAA GGT AAA AAC CTG GGT CGT GGC TAC GAC GAC GAA GAT ATC CTG AAA 1441
Lys Gly Lys Asn Leu Gly Arg Gly Tyr Asp Asp Glu Asp Ile Leu Lys
660 665 670

TAT GTT GAT GTT GGT GCT ACC TAC TAC TTC AAC AAA AAC ATG TCC ACC 1489
Tyr Val Asp Val Gly Ala Thr Tyr Tyr Phe Asn Lys Asn Met Ser Thr
675 680 685 690

TAC GTT GAC TAC AAA ATC AAC CTG CTG GAC GAC AAC CAG TTC ACT CGT 1537
Tyr Val Asp Tyr Lys Ile Asn Leu Leu Asp Asp Asn Gln Phe Thr Arg
695 700 705

GAC GCT GGC ATC AAC ACT GAT AAC ATC GTA GCT CTG GGT CTG GTT TAC 1585
Asp Ala Gly Ile Asn Thr Asp Asn Ile Val Ala Leu Gly Leu Val Tyr
710 715 720

CAG TTC TAA TCTCGATTGA TATCGAACAA GGGCCTGCGG GCCCTTTTTT 1634
Gln Phe *
725

CATTGTTTTC AGCGTACAAA CTCAGTTTTT TGGTGTACTC TTGCGACCGT TCGCATGAGG 1694

ATAATCACGT ACGGAAATA 1713

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 367 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met Lys Val Lys Val Leu Ser Leu Leu Val Pro Ala Leu Leu Val Ala
1 5 10 15

Gly Ala Ala Asn Ala Ala Glu Val Tyr Asn Lys Asp Gly Asn Lys Leu
20 25 30

Asp Leu Tyr Gly Lys Val Asp Gly Leu His Tyr Phe Ser Asp Asn Lys
35 40 45

Asp Val Asp Gly Asp Gln Thr Tyr Met Arg Leu Gly Phe Lys Gly Glu
50 55 60

Thr Gln Val Thr Asp Gln Leu Thr Gly Tyr Gly Gln Trp Glu Tyr Gln
65 70 75 80

Ile Gln Gly Asn Ser Ala Glu Asn Glu Asn Asn Ser Trp Thr Arg Val
85 90 95

Ala Phe Ala Gly Leu Lys Phe Gln Asp Val Gly Ser Phe Asp Tyr Gly
100 105 110

Arg Asn Tyr Gly Val Val Tyr Asp Val Thr Ser Trp Thr Asp Val Leu
115 120 125

Pro Glu Phe Gly Gly Asp Thr Tyr Gly Ser Asp Asn Phe Met Gln Gln
130 135 140

Arg Gly Asn Gly Phe Ala Thr Tyr Arg Asn Thr Asp Phe Phe Gly Leu
145 150 155 160

Val Asp Gly Leu Asn Phe Ala Val Gln Tyr Gln Gly Lys Asn Gly Asn
165 170 175
Pro Ser Gly Glu Gly Phe Thr Ser Gly Val Thr Asn Asn Gly Arg Asp
180 185 190

Ala Leu Arg Gln Asn Gly Asp Gly Val Gly Gly Ser Ile Thr Tyr Asp
195 200 205

Tyr Glu Gly Phe Gly Ile Gly Gly Ala Ile Ser Ser Ser Lys Arg Thr
210 215 220

Asp Ala Gln Asn Thr Ala Ala Tyr Ile Gly Asn Gly Asp Arg Ala Glu
225 230 235 240

Thr Tyr Thr Gly Gly Leu Lys Tyr Asp Ala Asn Asn Ile Tyr Leu Ala
245 250 255

Ala Gln Tyr Thr Gln Thr Tyr Asn Ala Thr Arg Val Gly Ser Leu Gly
260 265 270

Trp Ala Asn Lys Ala Gln Asn Phe Glu Ala Val Ala Gln Tyr Gln Phe
275 280 285

Asp Phe Gly Leu Arg Pro Ser Leu Ala Tyr Leu Gln Ser Lys Gly Lys
290 295 300

Asn Leu Gly Arg Gly Tyr Asp Asp Glu Asp Ile Leu Lys Tyr Val Asp
305 310 315 320

Val Gly Ala Thr Tyr Tyr Phe Asn Lys Asn Met Ser Thr Tyr Val Asp
325 330 335

Tyr Lys Ile Asn Leu Leu Asp Asp Asn Gln Phe Thr Arg Asp Ala Gly
340 345 350

Ile Asn Thr Asp Asn Ile Val Ala Leu Gly Leu Val Tyr Gln Phe *
355 360 365



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1808 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: ompF of E.coli 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 457..1545

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

AAAACTAATC CGCATTCTTA TTGCGGATTA GTTTTTTCTT AGCTAATAGC ACAATTTTCA 60

TACTATTTTT TGGCATTCTG GATGTCTGAA AGAAGATTTT GTGCCAGGTC GATAAAGTTT 120

CCATCAGAAA CAAAATTTCC GTTTAGTTAA TTTAAATATA AGGAAATCAT ATAAATAGAT 180

TAAAATTGCT GTAAATATCA TCACGTCTCT ATGGAAATAT GACGGTGTTC ACAAAGTTCC 240

TTAAATTTTA CTTTTGGTTA CATATTTTTT CTTTTTGAAA CCAAATCTTT ATCTTTGTAG 300

CACTTTCACG GTAGCGAAAC GTTAGTTTGA ATGGAAAGAT GCCTGCAGAC ACATAAAGAC 360

ACCAAACTCT CATCAATAGT TCCGTAAATT TTTATTGACA GAACTTATTG ACGGCAGTGG 420

CAGGTGTCAT AAAAAAAACC ATGAGGGTAA TAAATA ATG ATG AAG CGC AAT ATT 474
Met Met Lys Arg Asn Ile
1 5
CTG GCA GTG ATC GTC CCT GCT CTG TTA GTA GCA GGT ACT GCA AAC GCT 522
Leu Ala Val Ile Val Pro Ala Leu Leu Val Ala Gly Thr Ala Asn Ala
10 15 20

GCA GAA ATC TAT AAC AAA GAT GGC AAC AAA GTA GAT CTG TAC GGT AAA 570
Ala Glu Ile Tyr Asn Lys Asp Gly Asn Lys Val Asp Leu Tyr Gly Lys
25 30 35

GCT GTT GGT CTG CAT TAT TTT TCC AAG GGT AAC GGT GAA AAC AGT TAC 618
Ala Val Gly Leu His Tyr Phe Ser Lys Gly Asn Gly Glu Asn Ser Tyr
40 45 50

GGT GGC AAT GGC GAC ATG ACC TAT GCC CGT CTT GGT TTT AAA GGG GAA 666
Gly Gly Asn Gly Asp Met Thr Tyr Ala Arg Leu Gly Phe Lys Gly Glu
55 60 65 70

ACT CAA ATC AAT TCC GAT CTG ACC GGT TAT GGT CAG TGG GAA TAT AAC 714
Thr Gln Ile Asn Ser Asp Leu Thr Gly Tyr Gly Gln Trp Glu Tyr Asn
75 80 85

TTC CAG GGT AAC AAC TCT GAA GGC GCT GAC GCT CAA ACT GGT AAC AAA 762
Phe Gln Gly Asn Asn Ser Glu Gly Ala Asp Ala Gln Thr Gly Asn Lys
90 95 100

ACG CGT CTG GCA TTC GCG GGT CTT AAA TAC GCT GAC GTT GGT TCT TTC 810
Thr Arg Leu Ala Phe Ala Gly Leu Lys Tyr Ala Asp Val Gly Ser Phe
105 110 115

GAT TAC GGC CGT AAC TAC GGT GTG GTT TAT GAT GCA CTG GGT TAC ACC 858
Asp Tyr Gly Arg Asn Tyr Gly Val Val Tyr Asp Ala Leu Gly Tyr Thr
120 125 130

GAT ATG CTG CCA GAA TTT GGT GGT GAT ACT GCA TAC AGC GAT GAC TTC 906
Asp Met Leu Pro Glu Phe Gly Gly Asp Thr Ala Tyr Ser Asp Asp Phe
135 140 145 150

TTC GTT GGT CGT GTT GGC GGC GTT GCT ACC TAT CGT AAC TCC AAC TTC 954
Phe Val Gly Arg Val Gly Gly Val Ala Thr Tyr Arg Asn Ser Asn Phe
155 160 165

TTT GGT CTG GTT GAT GGC CTG AAC TTC GCT GTT CAG TAC CTG GGT AAA 1002
Phe Gly Leu Val Asp Gly Leu Asn Phe Ala Val Gln Tyr Leu Gly Lys
170 175 180

AAC GAG CGT GAC ACT GCA CGC CGT TCT AAC GGC GAC GGT GTT GGC GGT 1050
Asn Glu Arg Asp Thr Ala Arg Arg Ser Asn Gly Asp Gly Val Gly Gly
185 190 195

TCT ATC AGC TAC GAA TAC GAA GGC TTT GGT ATC GTT GGT GCT TAT GGT 1098
Ser Ile Ser Tyr Glu Tyr Glu Gly Phe Gly Ile Val Gly Ala Tyr Gly
200 205 210



GCA GCT GAC CGT ACC AAC CTG CAA GAA GCT CAA CCT CTT GGC AAC GGT 1146
Ala Ala Asp Arg Thr Asn Leu Gln Glu Ala Gln Pro Leu Gly Asn Gly
215 220 225 230

AAA AAA GCT GAA CAG TGG GCT ACT GGT CTG AAG TAC GAC GCG AAC AAC 1194
Lys Lys Ala Glu Gln Trp Ala Thr Gly Leu Lys Tyr Asp Ala Asn Asn
235 240 245

ATC TAC CTG GCA GCG AAC TAC GGT GAA ACC CGT AAC GCT ACG CCG ATC 1242
Ile Tyr Leu Ala Ala Asn Tyr Gly Glu Thr Arg Asn Ala Thr Pro Ile
250 255 260

ACT AAT AAA TTT ACA AAC ACC AGC GGC TTC GCC AAC AAA ACG CAA GAC 1290
Thr Asn Lys Phe Thr Asn Thr Ser Gly Phe Ala Asn Lys Thr Gln Asp
265 270 275

GTT CTG TTA GTT GCG CAA TAC CAG TTC GAT TTC GGT CTG CGT CCG TCC 1338
Val Leu Leu Val Ala Gln Tyr Gln Phe Asp Phe Gly Leu Arg Pro Ser
280 285 290

ATC GCT TAC ACC AAA TCT AAA GCG AAA GAC GTA GAA GGT ATC GGT GAT 1386
Ile Ala Tyr Thr Lys Ser Lys Ala Lys Asp Val Glu Gly Ile Gly Asp
295 300 305 310

GTT GAT CTG GTG AAC TAC TTT GAA GTG GGC GCA ACC TAC TAC TTC AAC 1434
Val Asp Leu Val Asn Tyr Phe Glu Val Gly Ala Thr Tyr Tyr Phe Asn
315 320 325

AAA AAC ATG TCC ACC TAT GTT GAC TAC ATC ATC AAC CAG ATC GAT TCT 1482
Lys Asn Met Ser Thr Tyr Val Asp Tyr Ile Ile Asn Gln Ile Asp Ser
330 335 340

GAC AAC AAA CTG GGC GTA GGT TCA GAC GAC ACC GTT GCT GTG GGT ATC 1530
Asp Asn Lys Leu Gly Val Gly Ser Asp Asp Thr Val Ala Val Gly Ile
345 350 355

GTT TAC CAG TTC TAA TAGCACACCT CTTTGTTAAA TGCCGAAAAA ACAGGACTTT 1585
Val Tyr Gln Phe *
360

GGTCCTGTTT TTTTTATACC TTCCAGAGCA ATCTCACGTC TTGCAAAAAC AGCCTGCGTT 1645

TTCATCAGTA ATAGTTGGAA mTGTAAAT CTCCCGTTAC CCTGATAGCG GACTTCCCTT 1705
CTGTAACCAT AATGGAACCT CGTCATGTTT GAGAACATTA CCGGCGCTCC TGCCGACCCG 1765
ATTCTGGGCC TGGCCGATCT GTTTCGTGCC GATGAACGTC CCG 1808



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 362 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 
Met Met Lys Arg Asn Ile Leu Ala Val Ile Val Pro Ala Leu Leu Val
1 5 10 15

Ala Gly Thr Ala Asn Ala Ala Glu Ile Tyr Asn Lys Asp Gly Asn Lys
20 25 30

Val Asp Leu Tyr Gly Lys Ala Val Gly Leu His Tyr Phe Ser Lys Gly
35 40 45

Asn Gly Glu Asn Ser Tyr Gly Gly Asn Gly Asp Met Thr Tyr Ala Arg
50 55 60

Leu Gly Phe Lys Gly Glu Thr Gln Ile Asn Ser Asp Leu Thr Gly Tyr
65 70 75 80

Gly Gln Trp Glu Tyr Asn Phe Gln Gly Asn Asn Ser Glu Gly Ala Asp
85 90 95

Ala Gln Thr Gly Asn Lys Thr Arg Leu Ala Phe Ala Gly Leu Lys Tyr
100 105 110

Ala Asp Val Gly Ser Phe Asp Tyr Gly Arg Asn Tyr Gly Val Val Tyr
115 120 125

Asp Ala Leu Gly Tyr Thr Asp Met Leu Pro Glu Phe Gly Gly Asp Thr
130 135 140

Ala Tyr Ser Asp Asp Phe Phe Val Gly Arg Val Gly Gly Val Ala Thr
145 150 155 160

Tyr Arg Asn Ser Asn Phe Phe Gly Leu Val Asp Gly Leu Asn Phe Ala
165 170 175

Val Gln Tyr Leu Gly Lys Asn Glu Arg Asp Thr Ala Arg Arg Ser Asn
180 185 190

Gly Asp Gly Val Gly Gly Ser Ile Ser Tyr Glu Tyr Glu Gly Phe Gly
195 200 205

Ile Val Gly Ala Tyr Gly Ala Ala Asp Arg Thr Asn Leu Gln Glu Ala
210 215 220

Gln Pro Leu Gly Asn Gly Lys Lys Ala Glu Gln Trp Ala Thr Gly Leu
225 230 235 240

Lys Tyr Asp Ala Asn Asn Ile Tyr Leu Ala Ala Asn Tyr Gly Glu Thr
245 250 255

Arg Asn Ala Thr Pro Ile Thr Asn Lys Phe Thr Asn Thr Ser Gly Phe
260 265 270

Ala Asn Lys Thr Gln Asp Val Leu Leu Val Ala Gln Tyr Gln Phe Asp
275 280 285

Phe Gly Leu Arg Pro Ser Ile Ala Tyr Thr Lys Ser Lys Ala Lys Asp
290 295 300

Val Glu Gly Ile Gly Asp Val Asp Leu Val Asn Tyr Phe Glu Val Gly
305 310 315 320

Ala Thr Tyr Tyr Phe Asn Lys Asn Met Ser Thr Tyr Val Asp Tyr Ile
325 330 335

Ile Asn Gln Ile Asp Ser Asp Asn Lys Leu Gly Val Gly Ser Asp Asp
340 345 350

Thr Val Ala Val Gly Ile Val Tyr Gln Phe *
355 360